38 research outputs found
Compressive Embedding and Visualization using Graphs
Visualizing high-dimensional data has been a focus in data analysis
communities for decades, which has led to the design of many algorithms, some
of which are now considered references (such as t-SNE for example). In our era
of overwhelming data volumes, the scalability of such methods has become more
and more important. In this work, we present a method that makes it possible to
apply any visualization or embedding algorithm to very large datasets by considering only
a fraction of the data as input and then extending the information to all data
points using a graph encoding its global similarity. We show that in most
cases, using only a fraction of the samples is sufficient to diffuse the
information to all data points. In addition, we propose quantitative
methods to measure the quality of embeddings and demonstrate the validity of
our technique on both synthetic and real-world datasets.
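A minimal sketch of the idea, assuming a k-nearest-neighbour graph as the similarity graph, classical MDS as a stand-in for the embedding algorithm (t-SNE would work equally well), and a simple neighbour-averaging diffusion; these are illustrative choices, not the paper's exact pipeline:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# Toy high-dimensional data: two well-separated blobs in 10-D.
X = np.vstack([rng.normal(0, 1, (100, 10)),
               rng.normal(6, 1, (100, 10))])
n = len(X)

# k-nearest-neighbour graph encoding the global similarity of the data.
k = 8
tree = cKDTree(X)
_, nbrs = tree.query(X, k + 1)   # first neighbour is the point itself
nbrs = nbrs[:, 1:]

# Embed only a random sample (classical MDS as a stand-in embedding).
sample = rng.choice(n, 40, replace=False)
D2 = ((X[sample, None] - X[None, sample]) ** 2).sum(-1)
J = np.eye(len(sample)) - 1 / len(sample)       # centering matrix
B = -0.5 * J @ D2 @ J
w, V = np.linalg.eigh(B)
Y_sample = V[:, -2:] * np.sqrt(np.maximum(w[-2:], 0))  # 2-D coordinates

# Diffuse the sampled coordinates to every node: unsampled points are
# repeatedly replaced by the mean of their neighbours' coordinates.
Y = np.zeros((n, 2))
Y[sample] = Y_sample
known = np.zeros(n, bool)
known[sample] = True
for _ in range(200):
    Y[~known] = Y[nbrs[~known]].mean(axis=1)

print(Y.shape)   # a 2-D embedding of all 200 points from 40 embedded samples
```

Only the 40 sampled points are ever passed to the embedding algorithm; the graph carries their coordinates to the remaining 160 points.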
Development of the dynamic power system security and reliability assessment tools
The overall vulnerability of the electrical grid appears to be increasing. This is due to several factors, including the liberalization of the electricity market, the growth of demand, and the appearance of new energy sources with random production (solar and wind, for instance). Today, engineers are researching ways to reinforce the reliability of the grid and propose various topological and technological solutions. One problem they face is that it is difficult to see whether one configuration is better than another. The goal of this project is to develop a tool that can evaluate the security and reliability of various power systems and their topologies. Such a tool should make it possible to evaluate whether a given power system topology is better than another. Tools able to perform this task already exist [1]; however, they work statically, by iterating load flow computations. The improvement of this program is to run the simulation dynamically. The criterion used to decide whether a configuration is secure and stable is the loss of load probability (LLP). In order to evaluate this LLP, the first step is to define all possible scenarios. Then, there are two ways to proceed: (1) simulate all scenarios and compute the average loss of load, weighting each scenario by its probability; or (2) select only a subset of the possible scenarios by Monte Carlo simulation (MCS) [2] [1], simulate the selected scenarios, and then compute the average loss of load. The first method is the most reliable way to evaluate the loss of load, but it is not applicable to large grids because of the simulation time required. Thus, in this work, the second method is applied. This report focuses on some major points of the tool. First, it shows how the program uses the MCS method and how scenarios are chosen.
The number of events in a scenario is assumed to follow a Poisson distribution: few-event scenarios occur more often but have smaller effects, whereas many-event scenarios are rarer but have a more important impact. The document then focuses on how the simulation is performed, particularly on Eurostag and the API used. The report presents the rules applied to disconnect lines, machines and loads during simulation when the limits of the system are not respected. After this, the report shows how the loss of load is averaged and how the probability density function (PDF) is computed. In order to classify the result, the PDF of the simulation can be approximated by a sum of Gaussian PDFs; this is called the Gaussian mixture method (GMM) [2] [1]. In the end, simulation results are presented. Because of the numerous hypotheses made, the tool cannot give the exact loss of load probability (LLP). The main function of the program is to compare two different topologies and decide which one is best.
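The MCS estimate of the LLP can be sketched on a hypothetical toy system (all numbers below are assumed for illustration, not taken from the report): the number of outage events per scenario is drawn from a Poisson distribution, each event removes one generating unit, and the loss of load is whatever demand the remaining capacity cannot cover.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy system (all numbers assumed for illustration):
# 10 identical generators of 100 MW each serving a 700 MW demand.
n_units, unit_mw, demand = 10, 100.0, 700.0
lam = 1.5            # assumed mean number of outage events per scenario
n_scenarios = 50_000

# Monte Carlo sampling of scenarios: the number of events per scenario
# follows a Poisson distribution, as in the report's hypothesis.
n_events = rng.poisson(lam, n_scenarios)
outages = np.minimum(n_events, n_units)     # cannot lose more units than exist
available = (n_units - outages) * unit_mw
loss = np.maximum(0.0, demand - available)  # MW of load shed in each scenario

llp = np.mean(loss > 0)   # loss-of-load probability
ell = loss.mean()         # average loss of load (MW)
print(f"LLP ~= {llp:.4f}, average loss ~= {ell:.1f} MW")
```

A dynamic tool replaces the capacity bookkeeping above with a time-domain simulation of each sampled scenario, but the averaging over scenario probabilities is the same.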
DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications
Convolutional Neural Networks (CNNs) are a cornerstone of the Deep Learning
toolbox and have led to many breakthroughs in Artificial Intelligence. These
networks have mostly been developed for regular Euclidean domains such as those
supporting images, audio, or video. Because of their success, CNN-based methods
are becoming increasingly popular in Cosmology. Cosmological data often comes
as spherical maps, which make the use of traditional CNNs more complicated.
The commonly used pixelization scheme for spherical maps is the Hierarchical
Equal Area isoLatitude Pixelisation (HEALPix). We present a spherical CNN for
analysis of full and partial HEALPix maps, which we call DeepSphere. The
spherical CNN is constructed by representing the sphere as a graph. Graphs are
versatile data structures that can act as a discrete representation of a
continuous manifold. Using the graph-based representation, we define many of
the standard CNN operations, such as convolution and pooling. With filters
restricted to being radial, our convolutions are equivariant to rotation on the
sphere, and DeepSphere can be made invariant or equivariant to rotation. This
way, DeepSphere is a special case of a graph CNN, tailored to the HEALPix
sampling of the sphere. This approach is computationally more efficient than
using spherical harmonics to perform convolutions. We demonstrate the method on
a classification problem of weak lensing mass maps from two cosmological models
and compare the performance of the CNN with that of two baseline classifiers.
The results show that the performance of DeepSphere is always superior or equal
to both of these baselines. For high noise levels and for data covering only a
smaller fraction of the sphere, DeepSphere achieves typically 10% better
classification accuracy than those baselines. Finally, we show how learned
filters can be visualized to introspect the neural network.
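One HEALPix-specific ingredient such a graph CNN can exploit is pooling: in the NESTED pixel ordering, the four children of a coarse pixel at the next-finer resolution are contiguous, so pooling reduces to a reshape. A minimal sketch (a dummy feature vector stands in for a real map, which would come from healpy):

```python
import numpy as np

def healpix_pool(x, op=np.mean):
    """Pool a feature map on npix = 12 * nside**2 HEALPix pixels (NESTED
    ordering) down to resolution nside/2 by aggregating each group of
    4 sibling pixels, which are contiguous in this ordering."""
    return op(x.reshape(-1, 4), axis=1)

nside = 4
npix = 12 * nside**2               # 192 pixels at nside = 4
x = np.arange(npix, dtype=float)   # dummy per-pixel feature

y = healpix_pool(x)                # mean-pool to nside = 2
print(y.shape)                     # (48,)
```

Convolutions are then defined on the graph built from the pixel neighbourhoods at each resolution, and this pooling plays the role that strided pooling plays for images.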
On localisation and uncertainty measures on graphs
Due to the appearance of data on networks such as the internet or Facebook, the number of applications of signal processing on weighted graphs is increasing. Unfortunately, because of the irregular structure of these data, classical signal processing techniques are not directly applicable. In this paper, we examine the windowed graph Fourier transform (WGFT) and propose ambiguity functions to analyze the spread of the window in the vertex-frequency plane. We then observe through examples that there is a trade-off between vertex and frequency resolution, which matches our intuition from classical signal processing. Finally, we demonstrate an uncertainty principle for the spread of the ambiguity function. We verify with examples that this principle is sharp for the extreme values and emphasize the difference between the generalized graph ambiguity function and the classical one. We finish by demonstrating some Young and Hausdorff-Young like inequalities for graphs.
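An illustrative sketch of the WGFT using the generalized translation and modulation operators of vertex-frequency analysis (the path graph and the heat-kernel window are assumed choices, not the paper's examples):

```python
import numpy as np

rng = np.random.default_rng(2)

# Small path graph: Laplacian eigenvectors play the role of Fourier modes.
N = 20
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
L = np.diag(A.sum(1)) - A
lam, U = np.linalg.eigh(L)       # graph frequencies and Fourier basis

# Window defined in the spectral domain (an assumed heat-kernel window).
g_hat = np.exp(-5 * lam / lam.max())

f = rng.standard_normal(N)       # signal to analyse

# WGFT coefficient S[i, k] = <f, M_k T_i g>: translate the window to
# vertex i, modulate by graph frequency k, correlate with f.
S = np.empty((N, N))
for i in range(N):
    Ti_g = np.sqrt(N) * U @ (g_hat * U[i, :])   # window centred at vertex i
    for k in range(N):
        Mk_Ti_g = np.sqrt(N) * U[:, k] * Ti_g   # modulation by mode k
        S[i, k] = f @ Mk_Ti_g

print(S.shape)   # (20, 20) grid of vertex-frequency coefficients
```

The spread of |S| along the vertex axis versus the frequency axis is exactly the trade-off the ambiguity functions in the paper quantify.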
Chebyshev polynomial approximation for transductive learning on graphs
The goal of transductive learning is to recover the labels of a large amount of data from only a few known samples. In this work we use graphs, for two reasons. First, it is possible to construct a graph from a given dataset with features. The main assumption we make is that if two vertices are close (connected with a small weight), they will have similar labels. Thus, graph theory allows us to solve many such problems. Second, graph problems can be solved distributively. Imagine a sensor network with limited transmission power, where each sensor can only communicate with its closest neighbours. Implementing a distributed algorithm allows the solution to be computed directly on the sensors, without a special central node. In that case regression can be used to filter the noise, evaluate the data even when a sensor is broken, detect broken sensors, and estimate the data at points with no sensors. This could also be done centrally by a computer, but then all the data would have to be transferred to the computer, which consumes a lot of resources (here, energy). In [1], Zhou et al. present a recursive algorithm to solve the transductive learning problem; this algorithm can also be implemented distributedly. In this work we compare it with a Chebyshev polynomial approximation, presented in [2] and [3]. The main advantage of Chebyshev polynomials is their recursive properties. We also study the stability of the different algorithms. The criterion used to evaluate an algorithm is the communication cost, i.e. the number of messages exchanged between all the vertices. This work is divided into two main parts. In the first one, we consider the general graph regression problem. This first part is again divided into two subproblems. The first one is a general regression problem with the smoothness prior ‖∇x‖₂ (where x is the solution). After presenting the solution, we discuss Chebyshev polynomial approximation and the graph Fourier transform.
We give some examples based on the Minnesota road graph. In the second subproblem, we consider ridge regression. We study the convergence of our algorithm as a function of the basis, compare it with some known recursive algorithms, and quantify the impact of sparsity (needed to make our algorithm efficient) on the answer. We have worked on different datasets, but mainly on the DELVE Boston housing dataset. For the second main part, we consider classification. In this case the answer (the labels) is no longer continuous but discrete, so the computer has to take a decision. We show that this new problem can be treated as a one-shot filter. We then discuss the basis used for the graph Fourier transform, study how a graph can be constructed, and present practical cases based on several datasets (the most studied being USPS). In that part, we compare our Chebyshev approximation to an iterative algorithm. The link between all the problems is the way we use Chebyshev polynomial approximation: we choose a basis, take a graph Fourier transform, approximate the transfer function in the Fourier domain, and end with an inverse transform. In fact, we show that the direct and inverse graph Fourier transforms can be avoided and the solution computed directly and/or distributively.
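A sketch of the core mechanism, under an assumed Tikhonov-type transfer function h(λ) = 1/(1 + λ) (graph, filter, and polynomial order are all illustrative): the Chebyshev recurrence applies h(L) to a signal using only repeated multiplications by the Laplacian L, which is precisely the operation a sensor network can perform by exchanging messages between neighbours, with no eigendecomposition needed.

```python
import numpy as np

rng = np.random.default_rng(3)

# Random undirected graph and its combinatorial Laplacian.
N = 30
A = (rng.random((N, N)) < 0.2).astype(float)
A = np.triu(A, 1); A = A + A.T
L = np.diag(A.sum(1)) - A
lmax = np.linalg.eigvalsh(L).max()

h = lambda lam: 1.0 / (1.0 + lam)   # assumed ridge-type transfer function

# Chebyshev coefficients of h on [0, lmax] from Chebyshev nodes.
K = 40
theta = (np.arange(K) + 0.5) * np.pi / K
pts = lmax / 2 * (np.cos(theta) + 1)
c = 2.0 / K * np.array([(h(pts) * np.cos(k * theta)).sum() for k in range(K)])

def cheby_filter(L, x, c, lmax):
    """y ~= h(L) x via the Chebyshev recurrence; uses only products L x,
    each of which a node can compute from its neighbours' values."""
    Lt = (2.0 / lmax) * L - np.eye(len(x))   # spectrum rescaled to [-1, 1]
    T_prev, T_curr = x, Lt @ x
    y = 0.5 * c[0] * T_prev + c[1] * T_curr
    for k in range(2, len(c)):
        T_prev, T_curr = T_curr, 2 * Lt @ T_curr - T_prev
        y += c[k] * T_curr
    return y

x = rng.standard_normal(N)
y_cheb = cheby_filter(L, x, c, lmax)

# Exact filtering through the eigendecomposition, for comparison only.
lam, U = np.linalg.eigh(L)
y_exact = U @ (h(lam) * (U.T @ x))
print(np.abs(y_cheb - y_exact).max())   # small approximation error
```

The communication cost is K messages per edge, one per recurrence step, which is the quantity the work uses to compare algorithms.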
Designing Gabor windows using convex optimization
Redundant Gabor frames admit an infinite number of dual frames, yet only the
canonical dual Gabor system, constructed from the minimal l2-norm dual window,
is widely used. This window function, however, might lack desirable properties,
e.g. good time-frequency concentration, small support or smoothness. We employ
convex optimization methods to design dual windows satisfying the Wexler-Raz
equations and optimizing various constraints. Numerical experiments suggest
that alternate dual windows with considerably improved features can be found.
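A sketch of the approach in the simplest "painless" case (number of channels M equal to the signal length L, hop a dividing L, a Gaussian window; all parameters illustrative, and a plain KKT solve stands in for a general-purpose convex solver): the duality conditions reduce to linear constraints on the dual window h, namely sum_n g[l + n a] h[l + n a] = 1/M for every residue class l mod a, and any convex objective can be minimized over that affine set.

```python
import numpy as np

rng = np.random.default_rng(4)

L, a = 64, 8
M = L
t = np.arange(L)
g = np.exp(-0.5 * ((t - L / 2) / 6.0) ** 2)   # Gaussian analysis window

# Linear duality constraints A h = b, one row per residue class mod a.
A = np.zeros((a, L))
for l in range(a):
    idx = (l + a * np.arange(L // a)) % L
    A[l, idx] = g[idx]
b = np.full(a, 1.0 / M)

# Canonical dual: the minimal l2-norm solution of A h = b.
h_can = np.linalg.pinv(A) @ b

# Alternate dual: minimise a smoothness objective ||D h||^2 subject to
# A h = b, by solving the KKT system of the equality-constrained QP.
D = np.eye(L) - np.roll(np.eye(L), 1, axis=1)   # circular first difference
P = D.T @ D + 1e-6 * np.eye(L)                  # regularised Hessian
KKT = np.block([[P, A.T], [A, np.zeros((a, a))]])
sol = np.linalg.solve(KKT, np.concatenate([np.zeros(L), b]))
h_smooth = sol[:L]

# Both windows satisfy the duality constraints; the alternate one trades
# a slightly larger l2-norm for a smoother shape.
print(np.abs(A @ h_can - b).max(), np.abs(A @ h_smooth - b).max())
```

Replacing the smoothness objective with support or time-frequency concentration penalties gives the other design criteria mentioned in the abstract, at the cost of a general convex solver instead of a single linear solve.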